A data mining approach to discover genetic and environmental factors involved in multifactorial diseases
Authors
Abstract
In this paper, we are interested in discovering genetic and environmental factors that are involved in multifactorial diseases. Experiments have been carried out by the Biological Institute of Lille, and a large amount of data has been generated. To exploit these data, data mining tools are required, and we propose a two-phase optimisation approach using a specific genetic algorithm. During the first step, we select significant features with a specific genetic algorithm. Then, during the second step, we cluster affected individuals according to the features selected by the first phase. The paper describes the specificities of the genetic problem that we are studying, and presents in detail the genetic algorithm that we have developed to deal with this very large feature selection problem. Results on both artificial and real data are presented. © 2001 Elsevier Science Ltd. All rights reserved.
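The two-phase idea in the abstract (genetic-algorithm feature selection, then clustering of affected individuals on the selected features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy data generator, the fitness function (difference of group means with a size penalty), the one-point crossover and bit-flip mutation, and the use of plain k-means for the second phase. The paper's own genetic algorithm and clustering method may differ substantially.

```python
import random

def make_toy_data(n_people=40, n_features=8, seed=1):
    """Hypothetical data: features 0 and 1 are shifted for affected people."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n_people):
        y = i % 2  # 1 = affected, 0 = unaffected
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 2.0 * y
        row[1] += 2.0 * y
        rows.append(row)
        labels.append(y)
    return rows, labels

def fitness(mask, rows, labels):
    """Toy score (an assumption): summed gap between group means on the
    selected features, minus a penalty per selected feature."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return float("-inf")
    affected = [r for r, y in zip(rows, labels) if y == 1]
    controls = [r for r, y in zip(rows, labels) if y == 0]
    score = 0.0
    for j in selected:
        mean_a = sum(r[j] for r in affected) / len(affected)
        mean_c = sum(r[j] for r in controls) / len(controls)
        score += abs(mean_a - mean_c)
    return score - 0.5 * len(selected)

def select_features(rows, labels, pop_size=30, generations=40, seed=0):
    """Phase 1: evolve bit masks over features; return selected indices."""
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, rows, labels), reverse=True)
        parents = pop[: pop_size // 2]  # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:  # occasional bit-flip mutation
                k = rng.randrange(n)
                child[k] = 1 - child[k]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, rows, labels))
    return [j for j, bit in enumerate(best) if bit]

def kmeans(points, k=2, iterations=20, seed=0):
    """Phase 2: plain k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iterations):
        assign = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # leave an empty cluster's center unchanged
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

rows, labels = make_toy_data()
selected = select_features(rows, labels)
# Cluster only affected individuals, restricted to the selected features.
affected = [[r[j] for j in selected] for r, y in zip(rows, labels) if y == 1]
clusters = kmeans(affected, k=2)
print("selected features:", selected)
print("cluster sizes:", [clusters.count(c) for c in range(2)])
```

The size penalty in `fitness` is what keeps the mask small. Without it, the search would have no reason to drop uninformative features, which matters when the number of candidate features is very large.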
Related resources
A perspective on the role of genetic and environmental factors in the development of asthma
Background and purpose: Asthma is a chronic inflammatory disease of the airways that is caused by hypersensitivity to environmental allergens. Symptoms of asthma include shortness of breath, airway hyper-responsiveness, wheezing, and cough. The disease may range from mild to severe and from intermittent to chronic. Asthma is known as a multifactorial disease due to the interaction of gene...
Data Mining for Genetics: A Genetic Algorithm Approach
MINING biological data is an emerging area of intersection between data mining and bioinformatics. Bio-informaticians have been working on the research and development of computational methodologies and tools for expanding the use of biological, medical, behavioral, or health-related data. Biological data mining aims to extract significant information from DNA, RNA and proteins. Many biological...
Feature Selection in Data-Mining for Genetics Using Genetic Algorithm
We discovered genetic features and environmental factors which were involved in multifactorial diseases. To exploit the massive data obtained from the experiments conducted at the General Hospital, Chennai, data mining tools were required and we proposed a 2-Phase approach using a specific genetic algorithm. This heuristic approach had been chosen as the number of features to consider was large...
Structural analysis of impacting factors of sustainable development in underground coal mining using DEMATEL method
Mining can become more sustainable by developing and integrating economic, environmental, and social components. Among the mining industries, coal mining requires serious attention to the aspects of sustainable development. Therefore, in this work, we investigate the impacting factors involved in the sustainable development of underground coal mining from the structural viewpoint. For ...
Data Mining in Genome Wide Association Studies
The genetic basis for some human diseases, in which one or a few genome regions increase the probability of acquiring the disease, is fairly well understood. For example, the risk for cystic fibrosis is linked to particular genomic regions. Identifying the genetic basis of more common diseases such as diabetes has proven to be more difficult, because many genome regions apparently are involved,...
Journal title:
- Knowl.-Based Syst.
Volume 15, Issue
Pages -
Publication date 2002